README for the Linux extended file system defragmenter

edefrag emergency release 0.3b alpha

Copyright Stephen C. Tweedie, 1992, 1993 (sct@dcs.ed.ac.uk)

Parts Copyright Remy Card, 1992 (card@masi.ibp.fr)
Parts Copyright Linus Torvalds, 1992 (torvalds@kruuna.helsinki.fi)

This file and the accompanying program may be redistributed under the
terms of the GNU General Public License.


INTRODUCTION: What does it do?
==============================

As a file system is used, data tends to become more and more scattered
over the disk, degrading performance.  A disk defragmenter simply
reorganises the data on the disk, so that individual files occupy a
single sequential set of disk blocks, and all the free space on the
disk is collected together in a single region.  This generally means
that reading a whole file is more efficient.

The extended file system stores a list of unused disk blocks in a
series of unused blocks scattered over the disk (the "free list").
When blocks are required to store data, they are removed from the head
of the list, and are added back when released (by unlinking or
truncating a file).

However, only the free blocks stored at the head of the list are
available to the extfs at any time.  This means that not all the free
space is known to the extfs when it tries to find a free block; as a
result, it does not always find the most efficient way to use free
space.

This is in contrast to the minix file system, in which free space is
stored in a single bitmap, and the file system can allocate free space
from anywhere on the disk.
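
To make the contrast concrete, here is a rough C sketch of the two
allocation strategies.  The structures and names below are invented
for this README (they are not the real file system code); the point is
simply that the free-list allocator can only hand out whatever block
numbers happen to sit in the list block it currently holds, whereas
the bitmap allocator can see every free block on the disk and pick a
well-placed one.

    /* Sketch only: invented structures, not the real fs code. */

    #define FREE_PER_BLOCK 254       /* entries per list block */

    struct free_list_block {
        unsigned long count;         /* entries currently in use */
        unsigned long next;          /* block holding the next part of the list */
        unsigned long free[FREE_PER_BLOCK];
    };

    /* extfs-style: allocate whatever is at the head of the free list. */
    unsigned long ext_alloc_block(struct free_list_block *head)
    {
        if (head->count == 0)
            return 0;                /* caller must load the next list block */
        return head->free[--head->count];
    }

    /* minix-style: scan a bitmap, so a block near 'goal' can be chosen. */
    long bitmap_alloc_block(unsigned char *map, long nblocks, long goal)
    {
        long i, b;

        for (i = 0; i < nblocks; i++) {
            b = (goal + i) % nblocks;
            if (!(map[b >> 3] & (1 << (b & 7)))) {
                map[b >> 3] |= 1 << (b & 7);    /* mark it in use */
                return b;
            }
        }
        return -1;                   /* no free block left */
    }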

This gradual loss of performance is unfortunate, because the larger
partitions and longer filenames the extended file system supports are
useful to have around.

So, here is the extended file system defragmenter - recover all that
lost performance from your extfs partition.

For an idea of the performance gains you might obtain - the first time
I defragmented my file system, the time taken to boot my PC (from
switching on until the XDM X windows login prompt stabilises) dropped
from 37 seconds to 27 seconds.

As for the performance of the defragmenter itself - well, that first
version worked, but it thrashed my hard disk solid for over an hour
(this was for a 90MB partition).  The current version now runs in not
much over 5 minutes, and most of the accesses are sequential (ie. NO
thrashing).  Granted, the fragmentation is no longer severe, but those
5 or 6 minutes still include reading and writing over 70MB of the
partition.

Note - as of release 0.3, minix file systems are also supported.

HOW TO USE: and a few warnings.
===============================

Number one (this applies to all - repeat, ALL - major file system
operations):

*** BACK UP ANY IMPORTANT DATA BEFORE YOU START. ***

There may be bugs in the defragmenter.  You may have errors on your
disk which go undiscovered until edefrag tries to write to a bad block
that has never been accessed before.  There may be power glitches,
memory glitches, kernel errors.  [e]defrag does some major
reorganisation of disk data, and if for any reason it doesn't finish
its work, most of your file system is likely to be trashed.

*** YOU HAVE BEEN WARNED. ***

*** NEVER try to defragment an active or mounted file system.

It is often safe to use [e]fsck on a mounted fs; don't be conned into
thinking that the same will work for [e]defrag.  The file system will
be totally unusable while [e]defrag is working; and if this causes a
kernel crash, or if the fs interferes with the defragmenter as it
runs, you may well lose your entire partition.

This means that in order to defragment a root partition, you will
probably need to run [e]defrag from a boot floppy.

However, it IS totally safe to run [e]defrag in its readonly mode (for
testing) on an active partition.

*** Run [e]fsck on the partition first, to check its integrity.

Although I have been quite careful about the defragmenter's behaviour
on a corrupt file system (it should back down gracefully before doing
anything irreversible), it may well cause a lot of damage if the file
system is invalid in any way.

In particular, there is currently no handling of read/write errors in
the defragmenter.  The extfs version DOES understand the bad block
inode (and the special handling now works - as of version 0.3b), so if
you suspect you might have bad blocks, try running efsck -t (test for
bad blocks) before defragmenting.

However, if you have an IDE drive, you needn't worry; you should never
get any hd errors, as IDE drives dynamically remap bad blocks
internally, as they occur.  Until I have proper bad block support for
minix, it's probably unwise to try to defragment a suspect, non-IDE
minix partition.

*** Run [e]defrag -r next, just to be sure.

If there are any bugs in the defragmenter, running in readonly mode
first may find them ([e]defrag does quite a lot of self-checking as it
goes) before you lose any data.

*** Reinstall lilo after defragmenting a bootable partition.

Defragmentation moves data around the disk.  edefrag knows all of the
file system's internal pointers to this data, so these are adjusted as
needed to keep the file system intact.  Lilo, unfortunately, keeps its
own pointers to the location of kernel image files, so that the kernel
can be loaded before the file system is running.  (These pointers
are usually kept in /etc/lilo/map.)  If you defragment a partition
containing a lilo-bootable kernel image, you MUST reinstall lilo to
rebuild the now-invalid map file.
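
Putting those warnings together, a typical session on a non-root
partition might look something like the following (the device name is
purely illustrative, and the last step applies only if lilo boots a
kernel image from the partition concerned):

    # umount /dev/hda3        [ the fs must NOT be mounted ]
    # efsck /dev/hda3         [ check integrity first; add -t if you
                                suspect bad blocks ]
    # edefrag -r /dev/hda3    [ readonly trial run ]
    # edefrag /dev/hda3       [ the real thing ]
    [ now reinstall lilo, if a lilo-bootable kernel lives here ]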


Usage: edefrag [-Vdrsv] [-p pool_size] /dev/name

    -V : Prints the full CVS version id for the release.  Send me
         this information with any problem reports or suggestions.
    -s : Show superblock information.
    -v : Verbose.  Shows what the program is doing.  If used
         twice, gives extra progress information.
    -r : Readonly.  This opens the file system in readonly mode,
         which guarantees that your data will not be harmed.  This
         can be useful for testing purposes, especially for
         working out the best buffer pool size to use.
    -d : (If enabled at compile-time) Debug mode.

    The pool_size is the number of 1KB (disk block) buffers to
    allocate to the buffer pool while relocating the file system
    data.  (Default is 512; it cannot be set below 20.)

    Finally, /dev/name should be the device to be defragmented; an
    image file may also be used (for debugging purposes), as
    edefrag does not check that the file is a block device.
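
For instance, a hypothetical invocation asking for verbose progress
reports and a larger-than-default pool (remember that the partition
must not be mounted) might be:

    # edefrag -v -p 1024 /dev/hda3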


HINTS
=====

You may want to experiment with edefrag to find the best memory usage
before defragmenting.  Currently, the significant tables held in
memory by edefrag are:

    Relocation maps - 8 bytes per block.
    Inode table     - 64 bytes per inode.
    Inode maps      - 8 bytes per inode.

The buffer pool must be added on top of this.

For a typical file system, this works out at around 26K of memory
required per MB of disk space, or 2.6MB of memory for a 100MB disk
partition; plus the buffer pool.
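
As a rough check on that figure (the inode density here is an
assumption for illustration; real file systems vary), each MB of
partition costs about:

    1024 blocks  x  8 bytes       =   8K   (relocation maps)
    ~256 inodes  x  (64+8) bytes  = ~18K   (inode table and maps)
                                    ----
                                    ~26K per MB, plus the buffer pool.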

It is safe to use a swap file or partition if memory is tight (but NOT
one on the file system being defragmented!); this may not even affect
performance much, since during its first (mapping) phase, the
defragmenter accesses the inode table but not the buffer pool; during
the second (relocating) phase, the inode table is unused and the
buffer pool comes into play.

(Don't worry about the defragmenter suddenly running out of memory
during its work; all the memory required is allocated and initialised
before it starts operation, so any memory errors should occur before
the file system gets touched.)

The defragmenter tries as hard as possible to group reads and writes
into long sequential accesses.  Data being overwritten on the disk
gets put into a rescue buffer, and may soon just get written back
during the normal course of sequential writes.  However, if the buffer
pool is too small or the disk is highly fragmented, edefrag tries to
clear out the rescued data by seeing if its final destination is empty
yet.  (These are termed "migrate" writes; the data migrates from the
rescue pool to the output pool.)  If that fails to free enough space,
edefrag forces some of the rescue buffers out into empty blocks
("forcing" writes), from which the data will have to be re-read at
some point.

The upshot of this is that normal buffer writes are highly sequential
and efficient; "migrate" writes are slightly less sequential, but
still quite efficient; and "forcing" writes cause data to be read
twice, and from this point of view are quite inefficient.
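
As a very rough sketch of that decision in C (the names and structure
are invented for this README and do not come from the edefrag source):

    enum write_kind {
        WRITE_NORMAL,        /* part of the long sequential write sweep */
        WRITE_MIGRATE,       /* rescued data whose destination is now empty */
        WRITE_FORCE,         /* rescued data parked in a spare empty block */
        KEEP_IN_POOL         /* hold on, hoping the destination empties */
    };

    struct rescue_buf {
        unsigned long dest;  /* block where this data finally belongs */
        int rescued;         /* nonzero: grabbed before being overwritten */
    };

    /* How should this buffered block be written out? */
    enum write_kind classify_write(const struct rescue_buf *b,
                                   int dest_is_free, int pool_is_full)
    {
        if (!b->rescued)
            return WRITE_NORMAL;    /* cheap: in strict sequential order */
        if (dest_is_free)
            return WRITE_MIGRATE;   /* slightly out of order, still cheap */
        if (pool_is_full)
            return WRITE_FORCE;     /* expensive: will have to be re-read */
        return KEEP_IN_POOL;
    }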

Running edefrag with the -r option will scan your file system
non-destructively, and will report on the work it would have to do to
defragment the disk.  This facility can be used to adjust the
requested pool size, trading memory use against defragmenting
efficiency.

For example, I have just run:

    $ edefrag -r /dev/hda3            [ default 512K buffer pool ]

    [ ... superblock statistics deleted ... ]
    Relocation statistics:
    44807 buffer reads in 91 groups, of which:
          14004 read-aheads.
    44807 buffer writes in 91 groups, of which:
          0 migrations, 0 forces.

    $ edefrag -r -p 100 /dev/hda3

    [ ... superblock statistics deleted ... ]
    45299 buffer reads in 618 groups, of which:
          13310 read-aheads.
    45299 buffer writes in 618 groups, of which:
          202 migrations, 492 forces.

The first result indicates a higher efficiency with 512 buffers
than with 100.  However, even the second run would have been quite
quick; 492 forces out of a 90MB file system is not bad.  (By the way,
the reason the total number of writes is less than 90MB is that much
of my hard disk was already defragmented anyway.  8-)

If, however, my disk had been badly fragmented (as it used to be...) I
would probably have had to allocate around 2000-4000 buffers to get
good efficiency with few forced writes.

The tradeoff is that the less memory you allocate for pool buffers,
the more is available for the kernel to cache reads itself.  Since the
kernel reads entire tracks at a time, leaving space to the kernel
effectively gives extra "free" buffer reads.

I'm not yet quite sure whether it is more efficient to leave the
kernel with a healthily large cache for itself, or to allocate as much
as possible to edefrag's own buffering scheme (which is better
optimised for the task).  You may want to experiment here, and I would
be interested in hearing any conclusions you reach.  I am running with
16MB of RAM, so if you have less your mileage may vary.


WARRANTY:
=========

NONE.  Use at your own risk.  BACK UP ANY IMPORTANT DATA BEFORE YOU
START.

I have successfully run edefrag on my own 90MB root extfs partition at
home.  It has been tested on particularly hard jobs, such as
defragmenting a 1.44MB floppy with a buffer pool restricted to 20KB -
lots of extra writes are necessary to cope with a tiny buffer pool.
This release has never crashed for me, and has never lost me any data.
I am confident enough to use it fairly regularly, and when I back up
data before using it, I only back up stuff which cannot be reinstalled
from other sources.  I have tried as far as possible to ensure that
edefrag will not harm your data.  However, I cannot make ANY guarantee
that it won't.  Use it and enjoy it, but don't blame me if it ruins
your day.

Having said that, if you DO have problems, let me know and I'll try to
fix them for the next release.  (Even better, send me bug fixes!)


TO DO:
======

Bad block support for the minix file system is still missing (see
above).  When the mark 2 extfs is released by Remy Card, I should
support that, too.

I currently read in the entire inode table before starting, and write
it out again at the end.  This is really a throw-back to edefrag's
origins in efsck.  Since I no longer access the inodes at all after
initially calculating the disk relocation maps, I could probably get
away with just accessing inode data as needed, so using less memory.
Alternatively, the inode table and the buffer pool could share memory,
since the two are never used at the same time.

The verbose (-v) option could do with a little rationalisation, and an
interactive (maybe full screen?) mode showing progress would be nice.

The sync() frequency should probably be configurable at run-time.

===
Stephen Tweedie (sct@dcs.ed.ac.uk).